Poster: aibek Date: Sep 4, 2012 8:49pm
Forum: forums Subject: Re: Problem with epub downloads of sanskrit books

The ‘plain text’ and ‘Epub’ files are produced by running Optical Character Recognition (OCR) software on the scanned page images. As far as I know, the Internet Archive runs the English version of the OCR on every book it has (i.e., it assumes all books are in English). Thus, only English books will have decent Epub files.

Running OCR for the language of your choice is trivial. I am sure that the Internet Archive will eventually have proper character recognition for Sanskrit and Hindi. Check your Epub files ten years hence!
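
To give a rough idea of what "running OCR for the language of your choice" can look like, here is a minimal sketch. It assumes the open-source Tesseract engine is installed along with its Sanskrit (`san`) and Hindi (`hin`) language data; this is only an illustration and not necessarily the software the Internet Archive uses. The helper names `build_ocr_command` and `run_ocr`, the `LANG_CODES` table, and the file names are made up for the example.

```python
import shutil
import subprocess

# Assumed mapping from language names to Tesseract language codes.
# The codes themselves are the ones Tesseract's language data uses.
LANG_CODES = {"english": "eng", "sanskrit": "san", "hindi": "hin"}


def build_ocr_command(image_path, output_base, language):
    """Build the Tesseract command line for one scanned page.

    Tesseract is invoked as: tesseract IMAGE OUTPUTBASE -l LANG
    and writes the recognized text to OUTPUTBASE.txt.
    """
    code = LANG_CODES[language.lower()]
    return ["tesseract", image_path, output_base, "-l", code]


def run_ocr(image_path, output_base, language):
    """Run OCR on one page image and return the path of the text file."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(build_ocr_command(image_path, output_base, language),
                   check=True)
    return output_base + ".txt"
```

With a scanned page saved as `page.png` (a hypothetical file name), `run_ocr("page.png", "page", "Sanskrit")` would write the recognized Devanagari text to `page.txt`. The only thing that changes between languages is the code passed after `-l`.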